On the accuracy of the Viterbi alignment
In a hidden Markov model, the underlying Markov chain is usually hidden.
Often, the maximum likelihood alignment (Viterbi alignment) is used as its
estimate. Although it has the highest likelihood, the Viterbi alignment can
behave very atypically by passing through highly unexpected states. To avoid
such situations, the Viterbi alignment can be modified by forcing it not to
pass these states. In this article, an iterative procedure for improving the
Viterbi alignment is proposed and studied. The iterative approach is compared
with a simple bunch approach where a number of states with low probability are
all replaced at the same time. The iterative way of adjusting the Viterbi
alignment proves more efficient and has several advantages
over the bunch approach. The same iterative algorithm for improving the Viterbi
alignment can also be used in the case of peeping, that is, when it is possible to
reveal hidden states. In addition, lower bounds for classification
probabilities of the Viterbi alignment under different conditions on the model
parameters are studied.
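For context, the maximum likelihood alignment discussed above is computed by the standard Viterbi dynamic program. The following is a minimal Python sketch with a hypothetical two-state toy model (the model parameters are invented for illustration; the iterative adjustment procedure proposed in the paper is not reproduced here):

```python
# Hypothetical two-state toy model (not taken from the paper).
states = ["A", "B"]
start_p = {"A": 0.6, "B": 0.4}
trans_p = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}}
emit_p = {"A": {0: 0.9, 1: 0.1}, "B": {0: 0.2, 1: 0.8}}

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the maximum-likelihood state path (the Viterbi alignment)."""
    # V[t][s] = (probability of the best path ending in s at time t, predecessor)
    V = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for t in range(1, len(obs)):
        V.append({})
        for s in states:
            prob, prev = max(
                (V[t - 1][p][0] * trans_p[p][s] * emit_p[s][obs[t]], p)
                for p in states
            )
            V[t][s] = (prob, prev)
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: V[-1][s][0])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(V[t][path[-1]][1])
    return list(reversed(path))

ml_path = viterbi([0, 1], states, start_p, trans_p, emit_p)
```

A forced version of the alignment, as in the paper, would additionally constrain the maximization away from designated low-probability states.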
A generalized risk approach to path inference based on hidden Markov models
Motivated by the unceasing interest in hidden Markov models (HMMs), this
paper re-examines hidden path inference in these models, using primarily a
risk-based framework. While the most common maximum a posteriori (MAP), or
Viterbi, path estimator and the minimum error, or Posterior Decoder (PD), have
long been around, other path estimators, or decoders, have been either only
hinted at or applied more recently and in dedicated applications generally
unfamiliar to the statistical learning community. Over a decade ago, however, a
family of algorithmically defined decoders aiming to hybridize the two standard
ones was proposed (Brushe et al., 1998). The present paper gives a careful
analysis of this hybridization approach, identifies several problems and issues
with it and other previously proposed approaches, and proposes practical
resolutions of those. Furthermore, simple modifications of the classical
criteria for hidden path recognition are shown to lead to a new class of
decoders. Dynamic programming algorithms to compute these decoders in the usual
forward-backward manner are presented. A particularly interesting subclass of
such estimators can be also viewed as hybrids of the MAP and PD estimators.
Similar to previously proposed MAP-PD hybrids, the new class is parameterized
by a small number of tunable parameters. Unlike their algorithmic predecessors,
the new risk-based decoders are more clearly interpretable, and, most
importantly, work "out of the box" in practice, which is demonstrated on some
real bioinformatics tasks and data. Some further generalizations and
applications are discussed in conclusion.
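For reference, the Posterior Decoder mentioned above picks, at each position, the state with the largest marginal posterior, computed by the forward-backward recursions. Below is a minimal unscaled sketch on a hypothetical two-state toy model (invented parameters; the paper's generalized risk-based decoders are not implemented):

```python
# Hypothetical two-state toy model (not taken from the paper).
states = ["A", "B"]
start_p = {"A": 0.6, "B": 0.4}
trans_p = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}}
emit_p = {"A": {0: 0.9, 1: 0.1}, "B": {0: 0.2, 1: 0.8}}

def posterior_decode(obs, states, start_p, trans_p, emit_p):
    """Posterior Decoder: at each position, pick the state with the largest
    marginal posterior, via unscaled forward-backward recursions."""
    n = len(obs)
    # Forward: alpha[t][s] = P(obs[:t+1], X_t = s)
    alpha = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    for t in range(1, n):
        alpha.append({s: emit_p[s][obs[t]]
                      * sum(alpha[t - 1][p] * trans_p[p][s] for p in states)
                      for s in states})
    # Backward: beta[t][s] = P(obs[t+1:] | X_t = s)
    beta = [None] * n
    beta[n - 1] = {s: 1.0 for s in states}
    for t in range(n - 2, -1, -1):
        beta[t] = {s: sum(trans_p[s][q] * emit_p[q][obs[t + 1]] * beta[t + 1][q]
                          for q in states)
                   for s in states}
    # alpha[t][s] * beta[t][s] is proportional to P(X_t = s | obs).
    return [max(states, key=lambda s: alpha[t][s] * beta[t][s])
            for t in range(n)]

pd_path = posterior_decode([0, 1, 0], states, start_p, trans_p, emit_p)
```

Unlike the Viterbi path, the sequence of pointwise-optimal states returned here need not itself be a path of positive probability, which is one motivation for the hybrid decoders the paper studies.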
An evolutionary model that satisfies detailed balance
We propose a class of evolutionary models that involves an arbitrary
exchangeable process as the breeding process and different selection schemes.
In those models, a new genome is born according to the breeding process, and
then a genome is removed according to the selection scheme that involves
fitness. Thus the population size remains constant. The process evolves
according to a Markov chain, and, unlike in many other existing models, the
stationary distribution -- so called mutation-selection equilibrium -- can be
easily found and studied. The behaviour of the stationary distribution when the
population size increases is our main object of interest. Several
phase-transition theorems are proved.
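The birth-then-removal dynamic described above can be sketched as a single update of the Markov chain. The breeding process and the fitness-based removal below are hypothetical placeholders (the paper allows an arbitrary exchangeable breeding process and various selection schemes); the sketch only illustrates how the population size stays constant:

```python
import random

def step(pop, breed, fitness, rng):
    """One update of the evolutionary Markov chain: a new genome is born
    via the breeding process, then one genome is removed by a selection
    scheme involving fitness, so the population size stays constant."""
    pop = pop + [breed(pop, rng)]  # birth
    # Hypothetical selection scheme: removal probability inversely
    # proportional to fitness (less fit genomes are removed more often).
    weights = [1.0 / fitness(g) for g in pop]
    victim = rng.choices(range(len(pop)), weights=weights)[0]
    return pop[:victim] + pop[victim + 1:]  # death
```

Iterating `step` and recording the visited populations gives an empirical handle on the mutation-selection equilibrium that the paper characterizes analytically.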
On expected score of cellwise alignments
We consider certain suboptimal alignments of two independent i.i.d. random sequences over a finite alphabet A = {1, ..., K}, both sequences having length n. In particular, we focus on so-called cellwise alignments, where in the first step as many 1s as possible are aligned. These aligned 1s define cells, and the rest of the alignment is defined so that the already existing alignment of 1s remains unchanged. We show that as n grows, for any cellwise alignment, the average score of a cell tends to the expected score of a random cell, a.s. Moreover, we show that a large deviation inequality holds.

The second part of the paper is devoted to calculating the expected score of a certain cellwise alignment referred to as the priority letter alignment. In this alignment, inside every cell, first all 2s are aligned. Then all 3s are aligned, but in such a way that the already existing alignment of 2s remains unchanged. We then continue with 4s and so on. Although the alignment is easy to describe, for K greater than 3 the exact formula for the expected score is not straightforward to find. We present a recursive formula for calculating the expected score.
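The first step of a cellwise alignment, aligning as many 1s as possible and cutting both sequences into the cells they delimit, can be sketched as follows (a simplified illustration; the function name and the in-order pairing are our own, and the within-cell priority letter alignment is not implemented):

```python
def cells(x, y, letter=1):
    """Align as many occurrences of `letter` as possible, in order, and
    return the aligned index pairs plus the cells they delimit."""
    ix = [i for i, a in enumerate(x) if a == letter]
    iy = [j for j, b in enumerate(y) if b == letter]
    k = min(len(ix), len(iy))           # as many 1s as possible are aligned
    pairs = list(zip(ix[:k], iy[:k]))   # monotone in both indices, hence a valid alignment
    out, px, py = [], 0, 0
    for i, j in pairs:
        out.append((x[px:i], y[py:j]))  # cell between consecutive aligned 1s
        px, py = i + 1, j + 1
    out.append((x[px:], y[py:]))        # trailing cell after the last aligned pair
    return pairs, out

aligned_pairs, cell_list = cells([1, 2, 1, 3], [2, 1, 1])
```

Inside each cell, the remainder of the alignment (for example the priority letter alignment, aligning 2s, then 3s, and so on) can then be chosen without disturbing the already aligned 1s.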